Introduction Knowledge of actual evapotranspiration is valuable for assessing water availability in policy and decision making for water resources and agriculture. Despite recent improvements, actual evapotranspiration remains difficult to measure in some locations. Accurate estimation of actual evapotranspiration therefore depends on determining the reference evapotranspiration (ETo), which is a significant component. The Food and Agriculture Organization of the United Nations (FAO) Penman-Monteith method is widely recognized for its high accuracy, making it a globally accepted standard. Despite this acceptance, the method requires a large amount of reliable weather measurements, such as solar radiation and wind speed. These data are often unavailable in developing countries because of the limited number of equipped meteorological stations or measurement inaccuracies. An alternative ETo method is therefore needed, and efficient artificial intelligence techniques with a small number of input data can reach accuracy equal to that of the FAO method. In this context, the preprocessing step, in which the important input data are selected, becomes especially important. This study introduces a novel approach by systematically comparing multiple preprocessing methods for ETo estimation and integrating decision making techniques to improve data selection and model accuracy. The preprocessing methods are based on correlation, regression analysis, and decision making approaches with different normalization methods. To increase the accuracy of the decisions, more than one evaluation criterion was considered in the analysis.

Materials and Methods The analysis in this study focuses on eleven stations (1992-2021). The stations are distributed across the North, West, North-West, East, and center of Iran.
The preprocessing step of the modeling process is important for deriving effective and precise factors as input data. Several preprocessing methods were investigated in this study to identify the dominant input data for ETo estimation: the Pearson correlation coefficient, Kendall’s tau-b correlation coefficient, the standardized Beta coefficient, stepwise regression, Shannon’s entropy, and simple additive weighting with fuzzy normalization. These methods were selected because they assess the important variables from different aspects, through correlation detection and data normalization, in support of accurate ETo estimation. The Pearson correlation coefficient measures the correlation between independent and dependent variables; higher values indicate higher dependency. Stepwise regression focuses on selecting the best and most influential variables from a large set of variables. Decision making is not always a choice between two options; sometimes the right selection must be made among several options. In this case, a multi-criteria decision is made, depending on the sensitivity of the problem, and certain methods can help to reach the best option. Several methods, such as Shannon’s entropy, are available for solving multi-criteria decision making (MCDM) problems. Entropy analysis assigns objective weights to the criteria. It assumes that data with high-weight indicators are more important than data with low-weight indicators. Regression analysis aims to minimize the error between observed and forecasted values; this can be achieved with support vector regression (SVR), which was used as the model in this study.

Results and Discussion At the monthly scale, the maximum Pearson correlation coefficient corresponds to solar radiation and maximum and minimum temperature at all stations. This result was confirmed by Kendall’s τ correlation coefficient.
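The correlation-based screening described above can be sketched in a few lines. This is a minimal illustration, not the study's implementation: the function names (`pearson`, `kendall_tau_b`, `rank_inputs`) are hypothetical, and the numbers are invented for demonstration rather than taken from the eleven stations.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def kendall_tau_b(x, y):
    """Kendall's tau-b, which corrects for ties in either variable."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: counted in neither term
        if dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom


def rank_inputs(inputs, eto, score=pearson):
    """Rank candidate inputs by the absolute value of their correlation with ETo."""
    scores = {name: score(values, eto) for name, values in inputs.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Illustrative values only (assumed, not data from the study).
eto = [1.2, 2.0, 3.1, 4.5, 5.8, 6.4]
inputs = {
    "Tmax": [8.0, 12.0, 17.0, 23.0, 29.0, 33.0],
    "RH": [70.0, 64.0, 55.0, 47.0, 35.0, 30.0],
    "U": [2.1, 2.6, 2.4, 3.0, 2.8, 2.9],
}

for name, value in rank_inputs(inputs, eto, score=kendall_tau_b):
    print(f"{name}: {value:+.2f}")
```

Ranking by absolute value keeps strongly negative relationships, such as relative humidity in this toy data, among the candidate inputs. Agreement between the two coefficients is the kind of cross-check reported in the results.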
At the annual scale, stepwise regression selected relative humidity, wind speed, solar radiation, and maximum temperature in Maku, and wind speed, maximum temperature, solar radiation, and sunshine hours in Yazd. Decision making analysis requires criteria; five criteria (RMSE, R, MAE, NSE, and GMER) were applied in Shannon’s entropy method. The selected criteria were used to find the best solution among all data (Tmax, Tmin, RH, U, S, and R) and different combinations of them. Combination 3-7, which has three input data (wind speed, solar radiation, and sunshine hours), has the highest weight in Maku (marked in pink). At the monthly scale, for the combination with five input data, the RMSE obtained with Shannon’s entropy is higher than that obtained with fuzzy normalization at all stations except Mashhad, where the two methods give the same RMSE, and Zanjan and Yazd, where Shannon’s entropy gives the lower error. At both scales, fuzzy normalization performs well. At the annual scale, the Pearson correlation and stepwise regression show the same performance. At the monthly scale, stepwise regression performs poorly. Selecting the input data based on fuzzy normalization could decrease the simulation error.

Conclusion The results indicated that the normalization process in the MCDM-based preprocessing method performed better than the other methods. The average of the criteria showed that the best method is not limited by climate type (wet, semiarid, or arid) and that fuzzy normalization performed well. This method has no geographical limitation. Identifying an efficient preprocessing method that gives an acceptable response in all climates is one of the strengths and innovations of this research. One factor that can strongly affect the MCDM-based preprocessing method is the type of decision making method.
In a decision making problem, the method used to normalize the decision matrix is highly important for information extraction. In general, maximum temperature, relative humidity, wind speed, solar radiation, sunshine hours (annual), and minimum temperature (monthly) were identified as the effective data. The better performance of certain data combinations is related to the high dependency between these combinations and ETo variation. Generally, using an appropriate preprocessing method in each climate, based on the data capabilities of the area and the selection of effective data, can improve the efficiency of ETo estimation. This can lead to precise determination of water availability and sound policymaking in irrigation planning and agricultural studies.
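The role of decision-matrix normalization can be illustrated with a minimal sketch of entropy weighting followed by simple additive weighting. Several points here are assumptions rather than details from the study: the fuzzy normalization is approximated by linear min-max scaling to [0, 1], the function names are hypothetical, and the criterion values are invented. GMER is left out because its ideal value is 1, so it is neither a plain benefit nor a plain cost criterion.

```python
import math


def entropy_weights(matrix):
    """Shannon-entropy weights for the criteria (columns) of a positive matrix.

    Requires at least two alternatives (rows) and strictly positive entries.
    """
    m, n = len(matrix), len(matrix[0])
    k = 1.0 / math.log(m)
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        probs = [v / total for v in column]
        entropy = -k * sum(p * math.log(p) for p in probs if p > 0)
        divergences.append(1.0 - entropy)  # low entropy means high contrast
    total_div = sum(divergences)
    return [d / total_div for d in divergences]


def min_max_normalize(matrix, benefit):
    """Scale each criterion to [0, 1]; cost criteria are inverted."""
    n = len(matrix[0])
    out = [[0.0] * n for _ in matrix]
    for j in range(n):
        column = [row[j] for row in matrix]
        lo, hi = min(column), max(column)
        span = hi - lo
        for i, v in enumerate(column):
            if span == 0:
                out[i][j] = 1.0  # criterion does not discriminate
            elif benefit[j]:
                out[i][j] = (v - lo) / span
            else:
                out[i][j] = (hi - v) / span
    return out


def saw_scores(matrix, weights, benefit):
    """Simple additive weighting: weighted sum of the normalized criteria."""
    normalized = min_max_normalize(matrix, benefit)
    return [sum(w * v for w, v in zip(weights, row)) for row in normalized]


# Rows: hypothetical input combinations. Columns: RMSE, MAE, NSE, R.
# Illustrative values only (assumed, not results from the study).
names = ["combo A", "combo B", "combo C"]
matrix = [
    [0.40, 0.30, 0.92, 0.96],
    [0.55, 0.42, 0.85, 0.93],
    [0.35, 0.28, 0.94, 0.97],
]
benefit = [False, False, True, True]  # RMSE and MAE are cost criteria

weights = entropy_weights(matrix)
scores = saw_scores(matrix, weights, benefit)
best = names[scores.index(max(scores))]
print(best, [round(s, 3) for s in scores])
```

Swapping `min_max_normalize` for another normalization changes the scores, and possibly the ranking, without touching the weights, which is one way to see why the choice of normalization matters.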